Skip to content

Conversation

ericallam
Copy link
Member

@ericallam ericallam commented Sep 9, 2025

This PR improves the experience when receiving certain types of OOM issues in production tasks, because currently we only really handle OOMs when the task run (worker) process heap exceeds the --max-old-space-size setting. If the RUN_ENGINE_TREAT_PRODUCTION_EXECUTION_STALLS_AS_OOM=1 env var is set, heartbeat stalls when runs are executed will be treated as OOM errors and our "OOM Retry" mechanism will be activated, if the task is configured to retry OOM errors on larger machines. I've also updated the copy of the OOM error to be more accurate:

CleanShot 2025-09-12 at 14 22 11

This PR also will align dev run heartbeat stalls with production runs, instead of just cancelling runs. Dev runs that stall while executing will now have the following error:

CleanShot 2025-09-12 at 14 24 29

If stalls are indeed happening because of CPU-heavy work and heartbeats are unable to be sent because of a blocked event loop, we have a new utility to allow users to "yield" and allow the run the continue executing:

import { heartbeats } from "@trigger.dev/sdk";

// code inside your task
for (const row of bigDataset) {
  await heartbeats.yield(); // safe to call every iteration, we will only actually yield when we need to
  process(row); // this is a synchronous operation
}

Copy link

changeset-bot bot commented Sep 9, 2025

🦋 Changeset detected

Latest commit: 1706618

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 23 packages
Name Type
@trigger.dev/sdk Patch
trigger.dev Patch
@trigger.dev/python Patch
d3-chat Patch
references-d3-openai-agents Patch
references-nextjs-realtime Patch
@trigger.dev/build Patch
@trigger.dev/core Patch
@trigger.dev/react-hooks Patch
@trigger.dev/redis-worker Patch
@trigger.dev/rsc Patch
@trigger.dev/schema-to-json Patch
@trigger.dev/database Patch
@trigger.dev/otlp-importer Patch
@internal/cache Patch
@internal/clickhouse Patch
@internal/redis Patch
@internal/replication Patch
@internal/run-engine Patch
@internal/schedule-engine Patch
@internal/testcontainers Patch
@internal/tracing Patch
@internal/zod-worker Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Contributor

coderabbitai bot commented Sep 9, 2025

Warning

Rate limit exceeded

@ericallam has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 6 minutes and 35 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between d1c51c0 and 1706618.

📒 Files selected for processing (1)
  • .changeset/soft-candles-grow.md (1 hunks)

Walkthrough

Adds a heartbeats subsystem (HeartbeatsAPI, HeartbeatsManager interface, StandardHeartbeatsManager) in core, exposes it via v3 exports and the SDK (yield, getLastHeartbeat), and replaces interval-based worker heartbeats with event-driven heartbeats in CLI workers. Run-engine changes: new option treatProductionExecutionStallsAsOOM, runStatusFromError now requires environmentType and maps stall errors differently, increased default execution timeouts, guards to prevent operating on FINISHED snapshots, and permanently-fail flows updated to use latest snapshot objects. Documentation updates: troubleshooting for stalled executions and expanded machines/ResourceMonitor guidance.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Pre-merge checks (1 passed, 1 warning, 1 inconclusive)

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 22.22% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Description Check ❓ Inconclusive I cannot find the current pull request description in the provided input, so I cannot verify whether it follows the repository's required PR description template or contains the necessary sections and details; therefore the check is inconclusive. Please add or paste the PR description that follows the repository template including a short summary, motivation, design/implementation details, list of affected modules/files, any new environment variables or config flags (e.g., RUN_ENGINE_TREAT_PRODUCTION_EXECUTION_STALLS_AS_OOM and updated timeouts), testing steps, rollout/migration notes, documentation updates, and any backwards-compatibility impacts so the template compliance and content completeness can be verified.
✅ Passed checks (1 passed)
Check name Status Explanation
Title Check ✅ Passed The title accurately captures the primary changes in the changeset — improving execution-stalls troubleshooting, aligning development and production stall behavior, and adding a heartbeats.yield utility — and is a single, clear sentence that reflects the main areas touched by the diff.
✨ Finishing touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch ea-branch-87

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
packages/core/src/v3/errors.ts (1)

306-310: Typo in error code literal: “TASK_HAS_N0_EXECUTION_SNAPSHOT”.

The “0” digit will never match the real code (“NO”), breaking classification and falling into assertExhaustive. Fix the literal.

Apply:

-        case "TASK_HAS_N0_EXECUTION_SNAPSHOT":
+        case "TASK_HAS_NO_EXECUTION_SNAPSHOT":
internal-packages/run-engine/src/engine/errors.ts (1)

55-55: Typo in error code literal: “TASK_HAS_N0_EXECUTION_SNAPSHOT”.

Same issue as in core errors; will never match the real code.

Apply:

-    case "TASK_HAS_N0_EXECUTION_SNAPSHOT":
+    case "TASK_HAS_NO_EXECUTION_SNAPSHOT":
🧹 Nitpick comments (19)
apps/webapp/app/env.server.ts (1)

522-523: 10-minute stall timeout defaults: verify operational alignment

Confirm dashboards/alarms and docs consistently reflect 10 minutes, and that staging/preview have overrides if needed.

packages/cli-v3/src/executions/taskRunProcess.ts (3)

112-116: Don't silently swallow cancel errors

Swallowing here hides unexpected failures to deliver CANCEL. Log at debug.

     try {
       await this.#cancel();
-    } catch (err) {}
+    } catch (err) {
+      logger.debug("cancel: failed to send CANCEL to child", { err, pid: this.pid });
+    }

126-127: Capture and log flush errors during cleanup

Keep cleanup resilient but log failures for diagnosis.

-    await tryCatch(this.#flush());
+    {
+      const [err] = await tryCatch(this.#flush());
+      if (err) logger.debug("cleanup: flush failed", { err, pid: this.pid });
+    }

442-443: Capture and log flush errors during suspend

Same rationale as cleanup.

-      await tryCatch(this.#flush());
+      {
+        const [err] = await tryCatch(this.#flush());
+        if (err) logger.debug("suspend: flush failed", { err, pid: this.pid });
+      }
docs/troubleshooting.mdx (1)

184-215: Tiny enhancement: mention env knobs for the 10‑min timeout

Consider adding: “You can configure the stall timeout via RUN_ENGINE_TIMEOUT_EXECUTING and RUN_ENGINE_TIMEOUT_EXECUTING_WITH_WAITPOINTS.”

packages/cli-v3/src/entryPoints/dev-run-controller.ts (2)

360-369: FINISHED handling: detach handlers before suspend (reduce race window)

Small window where a heartbeat could still fire. Detach first.

-          await this.exitTaskRunProcessWithoutFailingRun({
+          this.taskRunProcess?.unsafeDetachEvtHandlers();
+          await this.exitTaskRunProcessWithoutFailingRun({
             flush: true,
             reason: "already-finished",
           });

620-643: Don’t await heartbeat I/O in the event handler

Fire-and-forget to avoid backpressure if multiple heartbeats cluster.

-      const response = await this.httpClient.dev.heartbeatRun(
+      void this.httpClient.dev.heartbeatRun(
         this.runFriendlyId,
         this.snapshotFriendlyId,
         {
           cpu: 0,
           memory: 0,
         }
-      );
-
-      if (!response.success) {
-        logger.debug("[DevRunController] Heartbeat failed", { error: response.error });
-      }
+      ).then((response) => {
+        if (!response.success) {
+          logger.debug("[DevRunController] Heartbeat failed", { error: response.error });
+        }
+      });
packages/core/src/v3/heartbeats/types.ts (1)

1-8: Prefer type alias over interface per repo guidelines.

Functionally equivalent here and aligns with “we usually use types over interfaces”.

Apply:

-export interface HeartbeatsManager {
-  startHeartbeat(id: string): void;
-  stopHeartbeat(): void;
-  yield(): Promise<void>;
-  reset(): void;
-
-  get lastHeartbeat(): Date | undefined;
-}
+export type HeartbeatsManager = {
+  startHeartbeat(id: string): void;
+  stopHeartbeat(): void;
+  yield(): Promise<void>;
+  reset(): void;
+  readonly lastHeartbeat: Date | undefined;
+};
internal-packages/run-engine/src/engine/errors.ts (1)

29-34: Avoid magic string comparison for environment.

Use the exported constant if available, or a typed constant in-file to prevent typos.

Apply:

+const DEV: RuntimeEnvironmentType = "DEVELOPMENT";
-      if (environmentType === "DEVELOPMENT") {
+      if (environmentType === DEV) {
         return "CANCELED";
       }
packages/trigger-sdk/src/v3/heartbeats.ts (1)

37-39: Naming nit: prefer lastHeartbeat getter to mirror core

To keep SDK and core consistent, consider exporting a getter-style property rather than a getLastHeartbeat() function, e.g., lastHeartbeat(): Date | undefined or re-exporting the value via a getter. Not critical.

-export const heartbeats = {
-  yield: heartbeatsYield,
-  getLastHeartbeat: heartbeatsGetLastHeartbeat,
-};
+export const heartbeats = {
+  yield: heartbeatsYield,
+  get lastHeartbeat() {
+    return heartbeatsGetLastHeartbeat();
+  },
+};
packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)

33-34: Import attemptKey to align heartbeat IDs with dev worker

If we standardize on attempt-level IDs, bring in attemptKey.

   traceContext,
+  attemptKey,
   heartbeats,
 } from "@trigger.dev/core/v3";
internal-packages/run-engine/src/engine/index.ts (2)

216-217: Use Logger instead of console.log for startup message

Prefer the project Logger for consistency and log routing.

-      console.log("✅ Starting run engine worker");
+      this.logger.info("✅ Starting run engine worker");

1253-1265: Avoid hardcoding “sent every 30s” in stall messages

Heartbeat cadence may vary across environments. Remove the fixed interval or derive it from config.

-              ? `Run timed out after ${timeoutDuration} due to missing heartbeats (sent every 30s). Check if your \`trigger.dev dev\` CLI is still running, or if CPU-heavy work is blocking the main thread.`
-              : `Run timed out after ${timeoutDuration} due to missing heartbeats (sent every 30s). This typically happens when CPU-heavy work blocks the main thread.`;
+              ? `Run timed out after ${timeoutDuration} due to missing heartbeats. Check if your \`trigger.dev dev\` CLI is still running, or if CPU-heavy work is blocking the main thread.`
+              : `Run timed out after ${timeoutDuration} due to missing heartbeats. This typically happens when CPU-heavy work blocks the main thread.`;
packages/core/src/v3/heartbeats/api.ts (2)

41-43: Allow optional override when registering a global manager

Useful for tests/dev hot-reload; keeps API explicit.

-  public setGlobalManager(manager: HeartbeatsManager): boolean {
-    return registerGlobal(API_NAME, manager);
-  }
+  public setGlobalManager(manager: HeartbeatsManager, allowOverride = false): boolean {
+    return registerGlobal(API_NAME as any, manager, allowOverride);
+  }

58-65: Annotate return types and avoid returning values for void methods

Clarifies API shape and prevents accidental Promise propagation if implementations change.

-  public startHeartbeat(id: string) {
-    return this.#getManager().startHeartbeat(id);
-  }
+  public startHeartbeat(id: string): void {
+    this.#getManager().startHeartbeat(id);
+  }
 
-  public stopHeartbeat() {
-    return this.#getManager().stopHeartbeat();
-  }
+  public stopHeartbeat(): void {
+    this.#getManager().stopHeartbeat();
+  }
packages/core/src/v3/heartbeats/manager.ts (4)

22-28: Yield without adding artificial latency

24ms adds avoidable delay; a 0ms timeout is sufficient to yield cooperatively.

-    // Only call setImmediate if we haven't yielded in the last interval
+    // Only yield if we haven't yielded in the last interval
     if (Date.now() - this.lastHeartbeatYieldTime >= this.intervalInMs) {
-      // await setImmediate();
-      await setTimeout(24);
+      await new Promise<void>((r) => setTimeout(r, 0));
 
       this.lastHeartbeatYieldTime = Date.now();
     }

45-50: Only mark lastHeartbeat on success and preserve error details

Keep lastHeartbeat meaningful (successful send) and log the error object to retain stack/cause.

-          const [error] = await tryCatch(this.listener(id));
-          this.lastHeartbeatDate = new Date();
-
-          if (error) {
-            console.error("Failed to send HEARTBEAT message", { error: String(error) });
-          }
+          const [error] = await tryCatch(this.listener(id));
+          if (error) {
+            console.error("Failed to send HEARTBEAT message", { error });
+          } else {
+            this.lastHeartbeatDate = new Date();
+          }

Would you like lastHeartbeat to represent “last attempted” instead? If so, keep the original semantics and just adjust the logger.


59-61: Clear state on stop to avoid stale controller and gating

Optional, but makes stopHeartbeat fully idempotent and resets yield throttling until restarted.

   stopHeartbeat(): void {
-    this.currentAbortController?.abort();
+    this.currentAbortController?.abort();
+    this.currentAbortController = undefined;
+    this.lastHeartbeatYieldTime = undefined;
   }

11-16: Consider returning an unsubscribe from registerListener

Low-priority: returning a disposer helps tests swap listeners safely, even if you keep a single global callback at runtime.

-  registerListener(callback: (id: string) => Promise<void>) {
-    this.listener = callback;
-  }
+  registerListener(callback: (id: string) => Promise<void>): () => void {
+    this.listener = callback;
+    return () => {
+      if (this.listener === callback) this.listener = undefined;
+    };
+  }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b9b17e2 and 70b16ee.

⛔ Files ignored due to path filters (1)
  • references/hello-world/src/trigger/cpuHeavy.ts is excluded by !references/**
📒 Files selected for processing (20)
  • apps/webapp/app/env.server.ts (1 hunks)
  • docs/troubleshooting.mdx (1 hunks)
  • internal-packages/run-engine/src/engine/errors.ts (2 hunks)
  • internal-packages/run-engine/src/engine/index.ts (4 hunks)
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (7 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (8 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts (7 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (6 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (3 hunks)
  • packages/core/src/v3/errors.ts (2 hunks)
  • packages/core/src/v3/heartbeats-api.ts (1 hunks)
  • packages/core/src/v3/heartbeats/api.ts (1 hunks)
  • packages/core/src/v3/heartbeats/manager.ts (1 hunks)
  • packages/core/src/v3/heartbeats/types.ts (1 hunks)
  • packages/core/src/v3/index.ts (1 hunks)
  • packages/core/src/v3/links.ts (1 hunks)
  • packages/core/src/v3/utils/globals.ts (2 hunks)
  • packages/core/src/v3/workers/index.ts (1 hunks)
  • packages/trigger-sdk/src/v3/heartbeats.ts (1 hunks)
  • packages/trigger-sdk/src/v3/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations

Files:

  • packages/core/src/v3/heartbeats-api.ts
  • packages/core/src/v3/index.ts
  • packages/core/src/v3/workers/index.ts
  • packages/core/src/v3/errors.ts
  • apps/webapp/app/env.server.ts
  • packages/core/src/v3/utils/globals.ts
  • packages/trigger-sdk/src/v3/heartbeats.ts
  • packages/core/src/v3/links.ts
  • packages/cli-v3/src/executions/taskRunProcess.ts
  • packages/core/src/v3/heartbeats/types.ts
  • packages/trigger-sdk/src/v3/index.ts
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts
  • internal-packages/run-engine/src/engine/index.ts
  • packages/core/src/v3/heartbeats/api.ts
  • internal-packages/run-engine/src/engine/errors.ts
  • packages/core/src/v3/heartbeats/manager.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

We use zod a lot in packages/core and in the webapp

Files:

  • packages/core/src/v3/heartbeats-api.ts
  • packages/core/src/v3/index.ts
  • packages/core/src/v3/workers/index.ts
  • packages/core/src/v3/errors.ts
  • apps/webapp/app/env.server.ts
  • packages/core/src/v3/utils/globals.ts
  • packages/core/src/v3/links.ts
  • packages/core/src/v3/heartbeats/types.ts
  • packages/core/src/v3/heartbeats/api.ts
  • packages/core/src/v3/heartbeats/manager.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json

Files:

  • apps/webapp/app/env.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly

Files:

  • apps/webapp/app/env.server.ts
apps/webapp/app/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead

Files:

  • apps/webapp/app/env.server.ts
🧠 Learnings (6)
📚 Learning: 2024-10-18T15:41:52.352Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.

Applied to files:

  • packages/core/src/v3/errors.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities

Applied to files:

  • packages/trigger-sdk/src/v3/heartbeats.ts
  • packages/trigger-sdk/src/v3/index.ts
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts
📚 Learning: 2025-08-29T10:06:49.293Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-08-29T10:06:49.293Z
Learning: Applies to apps/webapp/**/*.{ts,tsx} : When importing from trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from trigger.dev/core’s package.json

Applied to files:

  • packages/trigger-sdk/src/v3/index.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project

Applied to files:

  • docs/troubleshooting.mdx
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : For idempotent child-task invocations, create and pass idempotencyKey (and optional TTL) when calling trigger()/batchTrigger() from tasks

Applied to files:

  • docs/troubleshooting.mdx
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Use triggerAndWait() only from within a task context (not from generic app code) and handle result.ok or use unwrap() with error handling

Applied to files:

  • docs/troubleshooting.mdx
🧬 Code graph analysis (14)
packages/core/src/v3/heartbeats-api.ts (2)
packages/trigger-sdk/src/v3/heartbeats.ts (1)
  • heartbeats (41-44)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsAPI (28-73)
packages/core/src/v3/errors.ts (1)
packages/core/src/v3/links.ts (1)
  • links (1-33)
packages/core/src/v3/utils/globals.ts (2)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsManager (70-72)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
packages/trigger-sdk/src/v3/heartbeats.ts (1)
packages/core/src/v3/heartbeats-api.ts (1)
  • heartbeats (5-5)
packages/cli-v3/src/executions/taskRunProcess.ts (1)
packages/core/src/utils.ts (1)
  • tryCatch (5-18)
packages/core/src/v3/heartbeats/types.ts (2)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsManager (70-72)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
  • id (1582-1589)
packages/cli-v3/src/entryPoints/dev-run-worker.ts (3)
packages/core/src/v3/heartbeats/manager.ts (1)
  • StandardHeartbeatsManager (5-77)
packages/core/src/v3/heartbeats-api.ts (1)
  • heartbeats (5-5)
packages/core/src/v3/idempotencyKeys.ts (1)
  • attemptKey (132-134)
internal-packages/run-engine/src/engine/index.ts (1)
packages/core/src/v3/index.ts (1)
  • formatDurationMilliseconds (35-35)
packages/core/src/v3/heartbeats/api.ts (2)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
packages/core/src/v3/utils/globals.ts (3)
  • registerGlobal (21-37)
  • unregisterGlobal (45-51)
  • getGlobal (39-43)
internal-packages/run-engine/src/engine/errors.ts (1)
packages/core/src/v3/schemas/common.ts (4)
  • TaskRunError (200-205)
  • TaskRunError (207-207)
  • RuntimeEnvironmentType (547-552)
  • RuntimeEnvironmentType (554-555)
packages/core/src/v3/heartbeats/manager.ts (3)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsManager (70-72)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
  • id (1582-1589)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (2)
internal-packages/run-engine/src/engine/errors.ts (2)
  • ServiceValidationError (69-78)
  • runStatusFromError (5-67)
internal-packages/run-engine/src/engine/systems/executionSnapshotSystem.ts (2)
  • EnhancedExecutionSnapshot (20-25)
  • status (428-449)
packages/cli-v3/src/entryPoints/managed-run-worker.ts (4)
packages/core/src/v3/heartbeats/manager.ts (1)
  • StandardHeartbeatsManager (5-77)
packages/core/src/v3/heartbeats-api.ts (1)
  • heartbeats (5-5)
packages/trigger-sdk/src/v3/heartbeats.ts (1)
  • heartbeats (41-44)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
  • id (1582-1589)
packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)
packages/core/src/v3/errors.ts (1)
  • SuspendedProcessError (527-533)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
🔇 Additional comments (24)
packages/core/src/v3/links.ts (1)

17-18: LGTM: new troubleshooting link

Matches the new doc anchor.

packages/cli-v3/src/entryPoints/dev-run-controller.ts (5)

7-7: Import of SuspendedProcessError

OK; used to special-case suspend exits.


528-533: Suspend path handling looks correct

Gracefully exits without failing the run on suspend.


598-599: Run attempt lifecycle flag initialization

Good reset before execution.


670-671: Marking completion state before returning process

Appropriate to guard double-completion flows.


792-822: Idempotent shutdown helper: good addition

Centralizes teardown; prevents duplicate onFinished calls.

packages/core/src/v3/workers/index.ts (1)

32-32: Good re-export; keeps workers surface coherent.

No issues. This enables CLI workers to import the standard manager from a stable path.

packages/trigger-sdk/src/v3/index.ts (1)

18-18: SDK now surfaces heartbeats — looks correct.

Additive, no breaking changes. Please ensure docs/examples reference import { heartbeats } from "trigger.dev/sdk/v3".

If helpful, I can open a docs follow-up PR to add usage examples. Want me to draft it?

packages/core/src/v3/heartbeats-api.ts (1)

1-5: Singleton entrypoint is fine and side-effect free when unused.

Naming aligns with other v3 “-api” modules; no default export; tree-shakeable. LGTM.

packages/core/src/v3/index.ts (1)

21-21: Publicly exposing heartbeats API via v3 index is correct.

Additive export; no ordering hazards observed.

packages/core/src/v3/utils/globals.ts (1)

3-3: Key name consistency verified
The API_NAME constant ("heartbeats") in heartbeats/api.ts matches the ["heartbeats"] slot in globals.ts. No changes needed.

packages/core/src/v3/errors.ts (1)

285-347: Confirm stalled-executing errors are intended to be retriable
internal-packages/run-engine/src/engine/retrying.ts and apps/webapp/app/v3/services/completeAttempt.server.ts both treat TASK_RUN_STALLED_EXECUTING (and its wait-points variant) as retriable—mapping to CANCELED in DEV and triggering retries. Verify this behavior is desired to avoid excessive churn in DEV.

internal-packages/run-engine/src/engine/errors.ts (1)

5-8: Verified no stale 1-arg calls remain. All invocations of runStatusFromError now pass two arguments.

packages/trigger-sdk/src/v3/heartbeats.ts (2)

30-32: Good wrapper: awaitable yield is correctly delegated

The wrapper cleanly forwards to the core API and keeps the public surface small.


41-44: Public API surface looks good

Minimal, intuitive, aligns with docs.

packages/cli-v3/src/entryPoints/dev-run-worker.ts (2)

563-564: Heartbeat lifecycle closed correctly

Stopping the heartbeat in finally ensures no leaks between runs.


698-700: Listener wiring is correct and awaits IPC send

The await prevents message pile-up; manager already guards/logs failures internally.

packages/cli-v3/src/entryPoints/managed-run-worker.ts (2)

299-300: Reset covers heartbeats as well

Good inclusion to ensure clean state before next run.


735-737: Listener registration LGTM

Heartbeats emit via IPC on schedule; matches core manager contract.

internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (4)

694-696: Good guard: prevent succeeding an already finished run

This avoids double-finishing edge cases under races.


851-853: Good guard: prevent failing an already finished run

Same rationale; this is a safe, defensive check.


933-941: Passing latestSnapshot object through failure path is cleaner and correct

Using EnhancedExecutionSnapshot ensures correct previousSnapshotId and environment-aware status mapping.

Also applies to: 1452-1476, 1522-1522


873-914: Confirm forceRequeue-driven runAttemptFailed emission semantics
forceRequeue always emits a runAttemptFailed event (runAttemptSystem.ts L873–914) even if the final outcome is fail_run or cancel_run; verify downstream analytics/UX won’t double-report or confuse terminal failure signals.

packages/core/src/v3/heartbeats/manager.ts (1)

1-78: Node-only contexts confirmed; node:timers/promises imports are safe. StandardHeartbeatsManager is only instantiated in CLI entry points and server-side (.server.ts) code with no client-side TSX or browser bundles, so no bundler breakage will occur.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (6)
packages/core/src/v3/heartbeats/api.ts (1)

6-24: Resolved: Noop manager matches the interface now

startHeartbeat/stopHeartbeat are sync and yield returns Promise. Looks good.

packages/cli-v3/src/entryPoints/managed-run-worker.ts (2)

533-534: Prefer attempt-level heartbeat IDs for parity and cleaner lifecycle

Use attemptKey(execution) if available so heartbeats tie to attempts, not runs.

-          standardHeartbeatsManager.startHeartbeat(_execution.run.id);
+          // Prefer attempt-level ID to avoid collisions across retries
+          standardHeartbeatsManager.startHeartbeat(attemptKey(execution));

To locate attemptKey in the repo (and confirm import path), run:

#!/bin/bash
rg -nP -C2 '\battemptKey\s*\(' --type=ts

139-143: Sanitize HEARTBEAT_INTERVAL_MS to avoid 0/NaN tight loops

Clamp and default the interval before constructing the manager.

-const standardHeartbeatsManager = new StandardHeartbeatsManager(
-  parseInt(heartbeatIntervalMs ?? "30000", 10)
-);
+const intervalMs = (() => {
+  const n = Number(heartbeatIntervalMs);
+  return Number.isFinite(n) && n > 0 ? Math.max(250, Math.floor(n)) : 30_000;
+})();
+const standardHeartbeatsManager = new StandardHeartbeatsManager(intervalMs);
 heartbeats.setGlobalManager(standardHeartbeatsManager);
packages/core/src/v3/heartbeats/manager.ts (2)

1-4: Avoid Node-only timers in core; make timers isomorphic

node:timers/promises breaks web/edge bundling. Replace with an abortable sleep helper and an async while loop.

 import { tryCatch } from "../tryCatch.js";
 import { HeartbeatsManager } from "./types.js";
-import { setInterval, setImmediate, setTimeout } from "node:timers/promises";
+
+// Isomorphic helpers (work in Node, browser, and edge)
+const delay = (ms: number) =>
+  new Promise<void>((resolve) => globalThis.setTimeout(resolve, ms));
+
+function abortableSleep(ms: number, signal: AbortSignal): Promise<void> {
+  return new Promise<void>((resolve, reject) => {
+    if (signal.aborted) return reject(Object.assign(new Error("Aborted"), { name: "AbortError" }));
+    const id = globalThis.setTimeout(() => {
+      signal.removeEventListener("abort", onAbort);
+      resolve();
+    }, ms);
+    function onAbort() {
+      globalThis.clearTimeout(id);
+      reject(Object.assign(new Error("Aborted"), { name: "AbortError" }));
+    }
+    signal.addEventListener("abort", onAbort, { once: true });
+  });
+}

40-58: Refactor loop to isomorphic, abortable while-loop and only update lastHeartbeat on success

Removes Node dependency and avoids reporting lastHeartbeat when the send failed.

-  private async startHeartbeatLoop(id: string, signal: AbortSignal) {
-    try {
-      for await (const _ of setInterval(this.intervalInMs, undefined, {
-        signal,
-      })) {
-        if (this.listener) {
-          const [error] = await tryCatch(this.listener(id));
-          this.lastHeartbeatDate = new Date();
-
-          if (error) {
-            console.error("Failed to send HEARTBEAT message", { error: String(error) });
-          }
-        }
-      }
-    } catch (error) {
-      // Ignore errors as we expect them to be thrown when the heartbeat is stopped
-      // And since we tryCatch inside the loop, we don't need to handle any other errors here
-    }
-  }
+  private async startHeartbeatLoop(id: string, signal: AbortSignal) {
+    try {
+      while (!signal.aborted) {
+        await abortableSleep(this.intervalInMs, signal);
+        if (signal.aborted) break;
+        if (this.listener) {
+          const [error] = await tryCatch(this.listener(id));
+          if (error) {
+            console.error("Failed to send HEARTBEAT message", { error: String(error) });
+          } else {
+            this.lastHeartbeatDate = new Date();
+          }
+        }
+      }
+    } catch {
+      // Expected on abort; no-op.
+    }
+  }
packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)

53-55: SIGTERM handler uses unstable reference; off() won’t remove it and this-binding is fragile

Registering with this.sigterm.bind(this) creates a new function, but off("SIGTERM", this.sigterm) uses a different reference → leak and potential wrong this. Use a stable field.

Apply:

@@
-  private isCompletingRun = false;
-  private isShuttingDown = false;
+  private isCompletingRun = false;
+  private isShuttingDown = false;
+  private readonly onSigterm = () => { void this.sigterm(); };
@@
-    process.on("SIGTERM", this.sigterm.bind(this));
+    process.on("SIGTERM", this.onSigterm);
@@
-    process.off("SIGTERM", this.sigterm);
+    process.off("SIGTERM", this.onSigterm);

Also applies to: 126-127, 859-860

🧹 Nitpick comments (3)
packages/core/src/v3/heartbeats/api.ts (2)

41-47: Expose allowOverride for controlled re-registration

Some runtimes may need to swap managers (dev ↔ prod). Consider threading allowOverride through to registerGlobal.

-  public setGlobalManager(manager: HeartbeatsManager): boolean {
-    return registerGlobal(API_NAME, manager);
-  }
+  public setGlobalManager(
+    manager: HeartbeatsManager,
+    { allowOverride = false }: { allowOverride?: boolean } = {}
+  ): boolean {
+    return registerGlobal(API_NAME, manager, allowOverride);
+  }

49-56: Clarify reset semantics (resets then disables)

reset() currently clears the underlying manager and unregisters the API. That’s a bit surprising for a method named reset. Either document the behavior or split into reset() and resetAndDisable().

+  /**
+   * Resets the current manager state and then disables the global API.
+   * NOTE: Call setGlobalManager(...) again to re-enable.
+   */
   public reset() {
     this.#getManager().reset();
     this.disable();
   }
packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)

719-724: Also sanitize usage heartbeat interval (minor)

Mirror the clamp/default logic to avoid overly chatty usage heartbeats in misconfigurations.

-  const prodUsageManager = new ProdUsageManager(devUsageManager, {
-    heartbeatIntervalMs: usageIntervalMs ? parseInt(usageIntervalMs, 10) : undefined,
+  const prodUsageManager = new ProdUsageManager(devUsageManager, {
+    heartbeatIntervalMs: (() => {
+      if (!usageIntervalMs) return undefined;
+      const n = Number(usageIntervalMs);
+      return Number.isFinite(n) && n > 0 ? Math.max(1_000, Math.floor(n)) : undefined;
+    })(),
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70b16ee and 928396a.

📒 Files selected for processing (4)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (7 hunks)
  • packages/core/src/v3/heartbeats/api.ts (1 hunks)
  • packages/core/src/v3/heartbeats/manager.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations

Files:

  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
  • packages/core/src/v3/heartbeats/manager.ts
  • packages/core/src/v3/heartbeats/api.ts
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

We use zod a lot in packages/core and in the webapp

Files:

  • packages/core/src/v3/heartbeats/manager.ts
  • packages/core/src/v3/heartbeats/api.ts
🧠 Learnings (2)
📚 Learning: 2024-10-18T15:41:52.352Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.

Applied to files:

  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities

Applied to files:

  • packages/cli-v3/src/entryPoints/managed-run-worker.ts
🧬 Code graph analysis (4)
packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)
packages/core/src/v3/errors.ts (1)
  • SuspendedProcessError (527-533)
packages/core/src/v3/heartbeats/manager.ts (2)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsManager (70-72)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
packages/core/src/v3/heartbeats/api.ts (2)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
packages/core/src/v3/utils/globals.ts (3)
  • registerGlobal (21-37)
  • unregisterGlobal (45-51)
  • getGlobal (39-43)
packages/cli-v3/src/entryPoints/managed-run-worker.ts (5)
packages/core/src/v3/heartbeats/manager.ts (1)
  • StandardHeartbeatsManager (5-78)
packages/core/src/v3/workers/index.ts (1)
  • StandardHeartbeatsManager (32-32)
packages/core/src/v3/heartbeats-api.ts (1)
  • heartbeats (5-5)
packages/trigger-sdk/src/v3/heartbeats.ts (1)
  • heartbeats (41-44)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
  • id (1582-1589)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (6)
packages/cli-v3/src/entryPoints/managed-run-worker.ts (3)

561-564: Good: always stop heartbeat in finally

Prevents leaks if execution throws or is cancelled.


296-301: Good: reset heartbeats with the rest of the environment

Ensures no stale timers or state across runs.


737-739: LGTM: event-driven heartbeat wiring

Listener sends TASK_HEARTBEAT on each tick; aligns with the new manager.

packages/cli-v3/src/entryPoints/dev-run-controller.ts (3)

7-7: Importing SuspendedProcessError — good usage

Using SuspendedProcessError to distinguish suspend vs. fail paths makes the error flow clearer.


598-599: isCompletingRun flags — appropriate gating

Setting/resetting around execute/complete provides a simple guard against double-finish races.

Also applies to: 670-671


528-533: SuspendedProcessError branch — appropriate fast-exit

Treating a suspension as a non-error path and finishing without completion submission is correct.

… prod behavior, adding heartbeats.yield utility
…age, add more information to the docs, improve resource monitor and add it to the docs
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

♻️ Duplicate comments (7)
packages/core/src/v3/errors.ts (1)

635-640: Also map the WITH_WAITPOINTS variant to the same troubleshooting link.

Without this, users hitting the waitpoints flavor won’t see the guide.

   TASK_RUN_STALLED_EXECUTING: {
     link: {
       name: "Read our troubleshooting guide",
       href: links.docs.troubleshooting.stalledExecution,
     },
   },
+  TASK_RUN_STALLED_EXECUTING_WITH_WAITPOINTS: {
+    link: {
+      name: "Read our troubleshooting guide",
+      href: links.docs.troubleshooting.stalledExecution,
+    },
+  },
packages/cli-v3/src/entryPoints/dev-run-controller.ts (3)

126-127: SIGTERM listener not removed — use a stable handler reference

The handler is registered with a bound function but removed with an unbound reference, so the listener leaks.

@@
-    process.on("SIGTERM", this.sigterm.bind(this));
+    process.on("SIGTERM", this.onSigterm);
@@
-    process.off("SIGTERM", this.sigterm);
+    process.off("SIGTERM", this.onSigterm);

Add a stable field:

@@
   private isShuttingDown = false;
+  private readonly onSigterm = () => { void this.sigterm(); }

Also applies to: 859-860


360-369: FINISHED path leaks the borrowed process and leaves listeners installed

On FINISHED you suspend and call shutdown, but shutdownExecution doesn’t return the process to the pool, unsubscribe run notifications, or detach SIGTERM; exitTaskRunProcessWithoutFailingRun also doesn’t guard suspend errors. This causes pool starvation and listener leaks.

@@
         case "FINISHED": {
           if (this.isCompletingRun) {
             logger.debug("Run is finished but we're completing it, skipping");
             return;
           }
 
           await this.exitTaskRunProcessWithoutFailingRun({
             flush: true,
             reason: "already-finished",
           });
           return;
         }
@@
-  private async exitTaskRunProcessWithoutFailingRun({
+  private async exitTaskRunProcessWithoutFailingRun({
     flush,
     reason,
   }: {
     flush: boolean;
     reason: string;
   }) {
-    await this.taskRunProcess?.suspend({ flush });
-
-    // No services should be left running after this line - let's make sure of it
-    this.shutdownExecution(`exitTaskRunProcessWithoutFailingRun: ${reason}`);
+    try {
+      await this.taskRunProcess?.suspend({ flush });
+    } catch (error) {
+      logger.debug("Failed to suspend task run process", { error });
+    }
+    // No services should be left running after this line - let's make sure of it
+    await this.shutdownExecution(`exitTaskRunProcessWithoutFailingRun: ${reason}`);
   }
 
-  private shutdownExecution(reason: string) {
+  private async shutdownExecution(reason: string) {
     if (this.isShuttingDown) {
       logger.debug(`[shutdown] ${reason} (already shutting down)`, {
         newReason: reason,
       });
       return;
     }
 
     logger.debug(`[shutdown] ${reason}`);
 
     this.isShuttingDown = true;
 
+    // Detach signal handler early
+    process.off("SIGTERM", this.onSigterm);
+
     this.snapshotPoller.stop();
 
-    this.opts.onFinished();
-    this.taskRunProcess?.unsafeDetachEvtHandlers();
+    // Unsubscribe if still subscribed
+    if (this.state.phase === "RUN") {
+      try {
+        this.unsubscribeFromRunNotifications(this.state);
+      } catch (error) {
+        logger.debug("[shutdown] Failed to unsubscribe from run notifications", { error });
+      }
+      this.state = { phase: "IDLE" };
+    }
+
+    // Return process to pool if held
+    if (this.taskRunProcess) {
+      try {
+        const version = this.opts.worker.serverWorker?.version || "unknown";
+        await this.opts.taskRunProcessPool.returnProcess(this.taskRunProcess, version);
+      } catch (error) {
+        logger.debug("[shutdown] Failed to return task run process to pool", { error });
+      } finally {
+        this.taskRunProcess = undefined;
+      }
+    }
+
+    this.taskRunProcess?.unsafeDetachEvtHandlers();
+    this.opts.onFinished();
   }

Also applies to: 792-804, 805-821


622-643: Wrap heartbeat await in try/catch to avoid unhandled rejections

Network/cancellation can throw and surface as unhandled rejections in the event handler.

-    taskRunProcess.onTaskRunHeartbeat.attach(async (runId) => {
-      if (!this.runFriendlyId || !this.snapshotFriendlyId) {
-        logger.debug("[DevRunController] Skipping heartbeat, no run ID or snapshot ID");
-        return;
-      }
-
-      logger.debug("[DevRunController] Sending heartbeat");
-
-      const response = await this.httpClient.dev.heartbeatRun(
-        this.runFriendlyId,
-        this.snapshotFriendlyId,
-        {
-          cpu: 0,
-          memory: 0,
-        }
-      );
-
-      if (!response.success) {
-        logger.debug("[DevRunController] Heartbeat failed", { error: response.error });
-      }
-    });
+    taskRunProcess.onTaskRunHeartbeat.attach(async (_runId) => {
+      try {
+        if (!this.runFriendlyId || !this.snapshotFriendlyId) {
+          logger.debug("[DevRunController] Skipping heartbeat, no run ID or snapshot ID");
+          return;
+        }
+        logger.debug("[DevRunController] Sending heartbeat");
+        const response = await this.httpClient.dev.heartbeatRun(
+          this.runFriendlyId,
+          this.snapshotFriendlyId,
+          { cpu: 0, memory: 0 }
+        );
+        if (!response.success) {
+          logger.debug("[DevRunController] Heartbeat failed", { error: response.error });
+        }
+      } catch (error) {
+        logger.debug("[DevRunController] Heartbeat threw", { error });
+      }
+    });
packages/core/src/v3/heartbeats/manager.ts (3)

11-12: Sanitize and clamp interval; avoid parameter property for validation

Prevents invalid/zero/negative intervals and ensures integer ms.

-export class StandardHeartbeatsManager implements HeartbeatsManager {
-  private listener: ((id: string) => Promise<void>) | undefined = undefined;
-  private currentAbortController: AbortController | undefined = undefined;
-  private lastHeartbeatYieldTime: number | undefined = undefined;
-  private lastHeartbeatDate: Date | undefined = undefined;
-
-  constructor(private readonly intervalInMs: number) {}
+export class StandardHeartbeatsManager implements HeartbeatsManager {
+  private listener: ((id: string) => Promise<void>) | undefined = undefined;
+  private currentAbortController: AbortController | undefined = undefined;
+  private lastHeartbeatYieldTime: number | undefined = undefined;
+  private lastHeartbeatDate: Date | undefined = undefined;
+  private readonly intervalInMs: number;
+
+  constructor(intervalInMs: number) {
+    const n = Math.floor(Number(intervalInMs));
+    this.intervalInMs = Number.isFinite(n) && n > 0 ? Math.max(250, n) : 1_000;
+  }

31-38: Good: stop previous heartbeat before starting a new one

Prevents overlapping loops and controller leaks. LGTM.


1-4: Make heartbeats isomorphic; remove node:timers/promises dependency

Core should be runtime-agnostic (Node/web/edge). Replace node-only timers with an abortable sleep loop; also update lastHeartbeatDate only on successful sends.

-import { tryCatch } from "../tryCatch.js";
-import { HeartbeatsManager } from "./types.js";
-import { setInterval, setImmediate, setTimeout } from "node:timers/promises";
+import { tryCatch } from "../tryCatch.js";
+import { HeartbeatsManager } from "./types.js";
+
+// Isomorphic delay
+function delay(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+// Abortable sleep compatible with Node, browser, and edge runtimes
+function abortableSleep(ms: number, signal: AbortSignal): Promise<void> {
+  return new Promise<void>((resolve, reject) => {
+    const id: any = setTimeout(resolve, ms);
+    const onAbort = () => {
+      clearTimeout(id);
+      const err = new Error("Aborted");
+      // @ts-expect-error: mark for diagnostics
+      err.name = "AbortError";
+      reject(err);
+    };
+    if (signal.aborted) {
+      onAbort();
+      return;
+    }
+    signal.addEventListener("abort", onAbort, { once: true });
+  });
+}
@@
   async yield(): Promise<void> {
     if (!this.lastHeartbeatYieldTime) {
       return;
     }
 
-    // Only call setImmediate if we haven't yielded in the last interval
+    // Throttle yielding to at most once per interval
     if (Date.now() - this.lastHeartbeatYieldTime >= this.intervalInMs) {
-      // await setImmediate();
-      await setTimeout(24);
+      await delay(24);
 
       this.lastHeartbeatYieldTime = Date.now();
     }
   }
@@
-  private async startHeartbeatLoop(id: string, signal: AbortSignal) {
+  private async startHeartbeatLoop(id: string, signal: AbortSignal) {
     try {
-      for await (const _ of setInterval(this.intervalInMs, undefined, {
-        signal,
-      })) {
-        if (this.listener) {
-          const [error] = await tryCatch(this.listener(id));
-          this.lastHeartbeatDate = new Date();
-
-          if (error) {
-            console.error("Failed to send HEARTBEAT message", { error: String(error) });
-          }
-        }
-      }
+      while (!signal.aborted) {
+        await abortableSleep(this.intervalInMs, signal);
+        if (signal.aborted) break;
+        if (this.listener) {
+          const [error] = await tryCatch(this.listener(id));
+          if (error) {
+            console.error("Failed to send HEARTBEAT message", { error });
+          } else {
+            this.lastHeartbeatDate = new Date();
+          }
+        }
+      }
     } catch (error) {
       // Ignore errors as we expect them to be thrown when the heartbeat is stopped
       // And since we tryCatch inside the loop, we don't need to handle any other errors here
     }
   }

Also applies to: 17-29, 40-58

🧹 Nitpick comments (12)
packages/trigger-sdk/src/v3/index.ts (1)

18-18: Narrow the re-export to the named symbol (heartbeats) — avoid export *

heartbeats is a named export in ./heartbeats.js; there is no default export or HeartbeatsAPI type to re-export, so replace the wildcard with an explicit named export.

-export * from "./heartbeats.js";
+export { heartbeats } from "./heartbeats.js";
internal-packages/run-engine/src/engine/types.ts (1)

74-74: Document the new option for clarity.

Add a brief JSDoc so consumers understand behavior and default.

   heartbeatTimeoutsMs?: Partial<HeartbeatTimeouts>;
-  treatProductionExecutionStallsAsOOM?: boolean;
+  /** When true, production EXECUTING stalls are surfaced as TASK_PROCESS_OOM_KILLED instead of stall errors. Default: false. */
+  treatProductionExecutionStallsAsOOM?: boolean;
packages/core/src/v3/errors.ts (1)

312-316: Remove stale TODO.

The stalled-executing codes are already handled here; the TODO is obsolete.

-        //new heartbeat error
-        //todo
         case "TASK_RUN_STALLED_EXECUTING":
         case "TASK_RUN_STALLED_EXECUTING_WITH_WAITPOINTS":
internal-packages/run-engine/src/engine/index.ts (2)

217-218: Use logger instead of console.log for consistency.

Route through the configured Logger.

-      console.log("✅ Starting run engine worker");
+      this.logger.info("✅ Starting run engine worker");

1265-1296: Messaging tweaks: avoid stale heartbeat interval and align OOM text.

  • Don’t hardcode “heartbeats (sent every 30s)”—this may drift.
  • Match the OOM message to the pretty error copy for consistency.
-          const errorMessage =
-            latestSnapshot.environmentType === "DEVELOPMENT"
-              ? `Run timed out after ${timeoutDuration} due to missing heartbeats (sent every 30s). Check if your \`trigger.dev dev\` CLI is still running, or if CPU-heavy work is blocking the main thread.`
-              : `Run timed out after ${timeoutDuration} due to missing heartbeats (sent every 30s). This typically happens when CPU-heavy work blocks the main thread.`;
+          const errorMessage =
+            latestSnapshot.environmentType === "DEVELOPMENT"
+              ? `Run timed out after ${timeoutDuration} due to missing heartbeats. Check if your \`trigger.dev dev\` CLI is still running, or if CPU-heavy work is blocking the main thread.`
+              : `Run timed out after ${timeoutDuration} due to missing heartbeats. This typically happens when CPU-heavy work blocks the main thread.`;
@@
-              ? ({
-                  type: "INTERNAL_ERROR",
-                  code: "TASK_PROCESS_OOM_KILLED",
-                  message: "Run was terminated due to running out of memory",
-                } satisfies TaskRunInternalError)
+              ? ({
+                  type: "INTERNAL_ERROR",
+                  code: "TASK_PROCESS_OOM_KILLED",
+                  message:
+                    "Your run was terminated due to exceeding the machine's memory limit. Try increasing the machine preset in your task options or replay using a larger machine.",
+                } satisfies TaskRunInternalError)
docs/machines.mdx (7)

33-41: Table/unit consistency (GB vs GiB)

Elsewhere in the doc/code you calculate using 1024-based units (GiB), but the table labels “10GB”. Consider standardizing on either GB (10^9) or GiB (2^30) across the page to avoid confusion.


178-181: Disk limit unit mismatch and naming

DISK_LIMIT_GB is computed with 1024-based math and later displayed as “GiB”. Either rename to DISK_LIMIT_GiB or convert using 1000-based math to align with “GB”.

Also applies to: 716-721


385-388: PID filter may miss legitimate targets

Filtering to pids > process.pid will ignore older long-lived targets (e.g., services). If the intent is “child-only,” document it; otherwise, remove the filter or make it configurable.

Also applies to: 452-454


198-200: Default verbosity may be too chatty

verbose = true includes full heapStats in every payload. Consider defaulting to false and enabling via config to reduce log volume and PII risk in logs.

Apply this diff:

-    this.verbose = true;
+    this.verbose = config.compactLogging ? false : false;
+    // consider an explicit `verbose` flag in ResourceMonitorConfig

Also applies to: 749-754


896-900: Use extensionless import in TS example

To avoid TS/ESM confusion, drop the .js extension in the TS code sample.

Apply this diff:

-import { ResourceMonitor } from "../resourceMonitor.js";
+import { ResourceMonitor } from "../resourceMonitor";

590-601: Label formatting: parsing stringified percents back to numbers

You stringify percents in payload, then parseFloat them to build labels. Consider keeping numeric fields numeric in payload and only formatting at the edge to simplify consumption.

Also applies to: 606-617


71-75: Platform expectations

ResourceMonitor relies on Linux tools (/proc, du, ps). Add a one-liner note that it targets Trigger’s Linux runtime; behavior on macOS/Windows will differ.

Also applies to: 851-887, 902-905

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 928396a and d1c51c0.

⛔ Files ignored due to path filters (4)
  • docs/images/machines-resource-monitor-ffmpeg.png is excluded by !**/*.png
  • docs/images/machines-resource-monitor-logs.png is excluded by !**/*.png
  • references/hello-world/src/resourceMonitor.ts is excluded by !references/**
  • references/hello-world/src/trigger/cpuHeavy.ts is excluded by !references/**
📒 Files selected for processing (23)
  • apps/webapp/app/env.server.ts (2 hunks)
  • apps/webapp/app/v3/runEngine.server.ts (1 hunks)
  • docs/machines.mdx (5 hunks)
  • docs/troubleshooting.mdx (1 hunks)
  • internal-packages/run-engine/src/engine/errors.ts (2 hunks)
  • internal-packages/run-engine/src/engine/index.ts (3 hunks)
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (7 hunks)
  • internal-packages/run-engine/src/engine/types.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts (7 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (7 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (3 hunks)
  • packages/core/src/v3/errors.ts (2 hunks)
  • packages/core/src/v3/heartbeats-api.ts (1 hunks)
  • packages/core/src/v3/heartbeats/api.ts (1 hunks)
  • packages/core/src/v3/heartbeats/manager.ts (1 hunks)
  • packages/core/src/v3/heartbeats/types.ts (1 hunks)
  • packages/core/src/v3/index.ts (1 hunks)
  • packages/core/src/v3/links.ts (1 hunks)
  • packages/core/src/v3/utils/globals.ts (2 hunks)
  • packages/core/src/v3/workers/index.ts (1 hunks)
  • packages/trigger-sdk/src/v3/heartbeats.ts (1 hunks)
  • packages/trigger-sdk/src/v3/index.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/troubleshooting.mdx
🚧 Files skipped from review as they are similar to previous changes (13)
  • packages/core/src/v3/heartbeats/types.ts
  • packages/core/src/v3/index.ts
  • packages/core/src/v3/workers/index.ts
  • packages/core/src/v3/heartbeats-api.ts
  • packages/core/src/v3/utils/globals.ts
  • internal-packages/run-engine/src/engine/errors.ts
  • packages/trigger-sdk/src/v3/heartbeats.ts
  • packages/core/src/v3/heartbeats/api.ts
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts
  • internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts
  • packages/cli-v3/src/executions/taskRunProcess.ts
  • apps/webapp/app/env.server.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations

Files:

  • internal-packages/run-engine/src/engine/types.ts
  • packages/core/src/v3/links.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • packages/trigger-sdk/src/v3/index.ts
  • internal-packages/run-engine/src/engine/index.ts
  • packages/core/src/v3/errors.ts
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
  • packages/core/src/v3/heartbeats/manager.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

We use zod a lot in packages/core and in the webapp

Files:

  • packages/core/src/v3/links.ts
  • apps/webapp/app/v3/runEngine.server.ts
  • packages/core/src/v3/errors.ts
  • packages/core/src/v3/heartbeats/manager.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json

Files:

  • apps/webapp/app/v3/runEngine.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly

Files:

  • apps/webapp/app/v3/runEngine.server.ts
apps/webapp/app/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead

Files:

  • apps/webapp/app/v3/runEngine.server.ts
🧠 Learnings (5)
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Import Trigger.dev APIs from "trigger.dev/sdk/v3" when writing tasks or related utilities

Applied to files:

  • packages/trigger-sdk/src/v3/index.ts
📚 Learning: 2025-08-29T10:06:49.293Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/webapp.mdc:0-0
Timestamp: 2025-08-29T10:06:49.293Z
Learning: Prefer Run Engine 2.0 via internal/run-engine; avoid extending legacy run engine code

Applied to files:

  • internal-packages/run-engine/src/engine/index.ts
📚 Learning: 2024-10-18T15:41:52.352Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#1418
File: packages/core/src/v3/errors.ts:364-371
Timestamp: 2024-10-18T15:41:52.352Z
Learning: In `packages/core/src/v3/errors.ts`, within the `taskRunErrorEnhancer` function, `error.message` is always defined, so it's safe to directly call `error.message.includes("SIGTERM")` without additional checks.

Applied to files:

  • packages/core/src/v3/errors.ts
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Define tasks using task({ id, run, ... }) with a unique id per project

Applied to files:

  • docs/machines.mdx
📚 Learning: 2025-08-18T10:07:17.368Z
Learnt from: CR
PR: triggerdotdev/trigger.dev#0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-08-18T10:07:17.368Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : When triggering a task multiple times in a loop from inside another task, use batchTrigger()/batchTriggerAndWait() instead of per-item trigger() calls

Applied to files:

  • docs/machines.mdx
🧬 Code graph analysis (4)
internal-packages/run-engine/src/engine/index.ts (2)
packages/core/src/v3/index.ts (1)
  • formatDurationMilliseconds (35-35)
packages/core/src/v3/schemas/common.ts (2)
  • TaskRunInternalError (153-193)
  • TaskRunInternalError (195-195)
packages/core/src/v3/errors.ts (1)
packages/core/src/v3/links.ts (1)
  • links (1-33)
packages/cli-v3/src/entryPoints/dev-run-controller.ts (2)
packages/core/src/v3/errors.ts (1)
  • SuspendedProcessError (527-533)
internal-packages/run-engine/src/engine/index.ts (1)
  • runId (1165-1435)
packages/core/src/v3/heartbeats/manager.ts (3)
packages/core/src/v3/heartbeats/api.ts (1)
  • HeartbeatsManager (70-72)
packages/core/src/v3/heartbeats/types.ts (1)
  • HeartbeatsManager (1-8)
internal-packages/run-engine/src/engine/systems/runAttemptSystem.ts (1)
  • id (1585-1592)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (10)
packages/trigger-sdk/src/v3/index.ts (1)

18-18: Re-exporting heartbeats via v3 entrypoint looks good

This aligns with our guidance to import from "trigger.dev/sdk/v3" and keeps the API discoverable from the main SDK surface.

packages/core/src/v3/links.ts (1)

17-18: Troubleshooting link addition looks good.

The new stalledExecution doc link is consistent with existing patterns.

packages/core/src/v3/errors.ts (2)

572-574: Making pretty error messages optional is appropriate.

Callsites already guard against undefined messages; this enables link-only guidance.


579-592: OOM guidance copy reads well and points to Machines docs.

No issues from me.

apps/webapp/app/v3/runEngine.server.ts (1)

18-19: Env-plumbed flag LGTM.

Uses env wrapper and follows the existing “1” sentinel pattern.

packages/cli-v3/src/entryPoints/dev-run-controller.ts (3)

528-533: SuspendedProcessError handling is correct

Gracefully finishing on suspension avoids false failures. LGTM.


620-621: Good: detach stale handlers before reattaching

Prevents duplicate listeners when reusing a pooled process. LGTM.


53-54: State guards added

isCompletingRun/isShuttingDown help avoid re-entrancy. LGTM.

docs/machines.mdx (2)

9-18: Example looks good

Setting a larger machine preset inline in the task snippet is clear and matches the surrounding guidance.


508-523: Sanity-check memory scaling math

You scale ps %MEM by totalMemory/machineMemoryBytes. This assumes ps %MEM is relative to system RAM—which is true—but edge cases (cgroups, container limits) can skew results. Add a note or guard when machineMemoryBytes > totalMemory or when scaling factor is extreme.

@ericallam ericallam merged commit f077d49 into main Sep 12, 2025
32 checks passed
@ericallam ericallam deleted the ea-branch-87 branch September 12, 2025 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants